Abstract


We consider graphs that represent pairwise marginal independencies amongst a set of variables (for instance, the zero entries of a covariance matrix for normal data). We characterize the directed acyclic graphs (DAGs) that faithfully explain a given set of independencies, and derive algorithms to efficiently enumerate such structures. Our results map out the space of faithful causal models for a given set of pairwise marginal independence relations. This allows us to show the extent to which causal inference is possible without using conditional independence tests.


1 INTRODUCTION 


DAGs and other graphical models encode conditional independence (CI) relationships in probability distributions. Therefore, CI tests are a natural building block of algorithms that infer such models from data. For example, the PC algorithm for learning DAGs (Kalisch and Bühlmann, 2007) and the FCI (Spirtes et al., 2000) and RFCI (Colombo et al., 2012) algorithms for learning maximal ancestral graphs are all based on CI tests.

CI testing is still an ongoing research topic, to which the UAI community is contributing (e.g. Zhang et al., 2011; Doran et al., 2014). But at least for continuous variables, CI testing will always remain more difficult than testing marginal independence for quite fundamental reasons (Bergsma, 2004). Intuitively, the difficulty is that two variables X and Y could be dependent "almost nowhere", e.g., for only a few values of the conditioning variable Z. This suggests a two-staged approach to structure learning: first try to learn as much as possible from simpler independence tests before applying CI tests. Here, we present a theoretical basis for extracting as much information as possible from the simplest kind of stochastic independence: pairwise marginal independence.





Figure 1: (a) A marginal independence graph U whose missing edges represent pairwise marginal independencies. (b) A faithful DAG G entailing the same set of pairwise marginal independencies as U. (c) A graph for which no such faithful DAG exists.


More precisely, we will consider the following problem. We are given the set of pairwise marginal independencies that hold amongst some variables of interest. Such sets can be represented as graphs whose missing edges correspond to independencies (Figure 1a). We call such graphs marginal independence graphs. We wish to find DAGs on the same variables that entail exactly the given set of pairwise marginal independencies (Figure 1b). We call such DAGs faithful. Sometimes no such DAGs exist (e.g., Figure 1c). Otherwise, we are interested in finding the set of all faithful DAGs, hoping that this set will be substantially smaller than the set of all possible DAGs on the same variables. Those candidate DAGs could then be probed further by using joint marginal or conditional independence tests.


Other authors have represented marginal (in)dependencies [...] covariances.


Our results generalize the work of Pearl and Wermuth (1994), who showed (but did not prove) how to find some faithful DAGs for a given covariance graph. We review these and other connections to related work in Section 3, where we also link our problem to the theory of partially ordered sets (posets). This connection allows us to identify certain maximal and minimal faithful DAGs. Based on these "boundary DAGs" we then derive a characterization of all faithful DAGs (Section 4), and construct related enumeration algorithms (Section 5). We use these algorithms to explore the combinatorial structure of faithful DAG models (Section 6), which leads, among other things, to a quantification of how much pairwise marginal independencies reduce structural causal uncertainty. Finally, we ask what happens when a set of independencies cannot be explained by any DAG. How many additional variables will we need? We prove that this problem is NP-hard (Section 7).

Preliminary versions of many of the results presented in this paper were obtained in the Master's thesis of the second author (Idelberger, 2014).


2 PRELIMINARIES 


In this paper we use the abbreviation iff for the connective "if and only if". A graph G = (V, E) consists of a set of nodes (variables) V and a set of edges E. We consider undirected graphs (which we simply refer to as graphs), directed graphs, and mixed graphs that can have both undirected edges (denoted as x − y) and directed edges (denoted as x → y). Two nodes are adjacent if they are linked by any edge. A clique in a graph is a node set C ⊆ V such that all u, v ∈ C are adjacent. Conversely, an independent set is a node set I ⊆ V in which no two nodes u, v ∈ I are adjacent. A maximal clique is a clique for which no proper superset of nodes is also a clique. For any v ∈ V, the neighborhood N(v) is the set of nodes adjacent to v and the boundary Bd(v) is the neighborhood of v including v, i.e. Bd(v) = N(v) ∪ {v}. A node v is called simplicial if Bd(v) is a clique. Equivalently, v is simplicial iff Bd(v) ⊆ Bd(w) for all w ∈ N(v) (Kloks et al., 2000). A clique that contains simplicial nodes is called a simplex. Every simplex is a maximal clique, and every simplicial node belongs to exactly one simplex. The degree d(v) of a node v is |N(v)|. If for two graphs G = (V, E(G)) and G' = (V, E(G')) we have E(G) ⊆ E(G'), then G is an edge subgraph of G' and G' is an edge supergraph of G. The skeleton of a directed graph G is obtained by replacing every edge u → v by an undirected edge u − v.


A path of length n − 1 is a sequence of n distinct nodes in which successive nodes are pairwise adjacent. A directed path x → ... → y consists of directed edges that all point towards y. In a directed graph, a node u is an ancestor of another node v if u = v or if there is a directed path u → ... → v. For each edge u → v, we say that u is a parent of v and v is a child of u. If two nodes u, v in a directed graph have a common ancestor w (which can be u or v), then the path u ← ... ← w → ... → v is called a trek connecting u and v. A DAG is called transitive if, for all u ≠ v, it contains an edge u → v whenever there is a directed path from u to v. Given a DAG G, the transitive closure is the unique transitive graph that implies the same ancestor relationships as G, whereas the transitive reduction is the unique edge-minimal graph that implies the same ancestor relationships.


In this paper we encounter several well-known graph classes, e.g., chordal graphs and trivially perfect graphs. We will give brief definitions when appropriate, but we direct the reader to the excellent survey by Brandstädt et al. (1999) for further details.


3 SIMPLE MARGINAL INDEPENDENCE 
GRAPHS 


In this section we define the class of graphs which can be explained using a directed acyclic graph (DAG) on the same variables. We will refer to such graphs as simple marginal independence graphs (SMIGs).

Definition 3.1. A graph U = (V, E(U)) is called the simple marginal independence graph (SMIG), or marginal independence graph, of a DAG G = (V, E(G)) if for all v, w ∈ V, v − w ∈ E(U) iff v and w have a common ancestor in G. If U is the marginal independence graph of G then we also say that G is faithful to U. SMIG is the set of all graphs U for which there exists a faithful DAG G. Note that each DAG has exactly one marginal independence graph.
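Definition 3.1 can be turned into a few lines of code. The following is a minimal sketch, assuming a DAG is given as a node list plus a set of (parent, child) pairs; the function names `ancestors` and `marginal_independence_graph` are illustrative choices, not notation from the paper.

```python
from itertools import combinations


def ancestors(nodes, edges, v):
    """Ancestors of v (v itself included) in a DAG with (parent, child) edges."""
    parents = {n: set() for n in nodes}
    for p, c in edges:
        parents[c].add(p)
    seen, stack = {v}, [v]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def marginal_independence_graph(nodes, edges):
    """Undirected edges v - w (as frozensets) for all pairs with a common ancestor."""
    anc = {v: ancestors(nodes, edges, v) for v in nodes}
    return {frozenset((v, w)) for v, w in combinations(nodes, 2) if anc[v] & anc[w]}


# Collider x -> z <- y: x and y share no ancestor, so the edge x - y is missing.
smig = marginal_independence_graph(["x", "y", "z"], {("x", "z"), ("y", "z")})
```

Because a node counts as its own ancestor, every directed edge of the DAG also appears as an undirected edge in the result.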


Again, we point out that marginal independence graphs are often called (and drawn as) bidirected graphs in the literature, though the term "marginal independence graph" has also been used by various authors (e.g. Tan et al., 2014).


3.1 SMIGs and Dependency Models


In this subsection we recall briefly the general setting for modeling (in)dependencies proposed by Pearl and Verma (1987) and show the relationship between that model and SMIGs. In the definitions below V denotes a set of variables and X, Y and Z are three disjoint subsets of V.

Definition 3.2 (Pearl and Verma (1987)). A dependency model M over V is any subset of triplets (X, Z, Y) which represent independencies, that is, (X, Z, Y) ∈ M asserts that X is independent of Y given Z.


A probabilistic dependency model M_P is defined in terms of a probability distribution P over V. By definition (X, Z, Y) ∈ M_P iff for any instantiation x, y and z of the variables in these subsets P(x | y, z) = P(x | z).


A directed acyclic graph dependency model M_G is defined in terms of a DAG G. By definition (X, Z, Y) ∈ M_G iff X and Y are d-separated by Z in G (for a definition of d-separation by a set Z see Pearl and Verma (1987)).


We define a marginal dependency model, resp. marginal probabilistic and marginal DAG dependency model, analogously to Pearl and Verma (1987), with the restriction that the second component of any triple (X, Z, Y) is the empty set. Thus, such marginal dependency models are sets of pairs (X, Y). It is easy to see that the following properties are satisfied.


Lemma 3.3. Let M be a marginal probabilistic dependency model or a marginal DAG dependency model. Then M is closed under:

Symmetry: (X, Y) ∈ M ⇔ (Y, X) ∈ M and
Decomposition: (X, Y ∪ W) ∈ M ⇒ (X, Y) ∈ M.
Moreover, if M is a marginal DAG dependency model then it is also closed under

Union: (X, Y), (X, W) ∈ M ⇒ (X, Y ∪ W) ∈ M.


The marginal probabilistic dependency model is not closed under union in general. For instance, consider two independent, uniformly distributed binary variables y and w and let x = y ⊕ w, where ⊕ denotes xor of two bits. For the model M_P defined in terms of probability over x, y, w we have that ({x}, {y}) and ({x}, {w}) belong to M_P but ({x}, {y, w}) does not.
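The xor counterexample can be checked by exact enumeration of the joint distribution. The sketch below assumes outcomes are ordered as (x, y, w) and uses exact fractions; the helper names `marginal` and `independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Joint distribution over (x, y, w): y and w are independent fair bits, x = y xor w.
joint = {(y ^ w, y, w): Fraction(1, 4) for y, w in product((0, 1), repeat=2)}


def marginal(joint, idx):
    """Marginal distribution of the variables at positions idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0) + p
    return out


def independent(joint, a, b):
    """True iff the variable groups a and b (tuples of positions) are independent."""
    pa, pb, pab = marginal(joint, a), marginal(joint, b), marginal(joint, a + b)
    return all(pab.get(ka + kb, 0) == pa[ka] * pb[kb] for ka in pa for kb in pb)


x_indep_y = independent(joint, (0,), (1,))      # ({x}, {y}) is in M_P
x_indep_w = independent(joint, (0,), (2,))      # ({x}, {w}) is in M_P
x_indep_yw = independent(joint, (0,), (1, 2))   # ({x}, {y, w}) is not
```

The two pairwise checks succeed while the joint check fails, which is exactly the failure of the union property described above.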

In this paper we will not assume that the marginal independencies in the data are closed under union. Instead, we only consider pairwise independencies, which we formalize as follows.

Definition 3.4. Let M be a marginal probabilistic dependency model over V. Then the simple marginal independence graph U = (V, E(U)) of M is the graph in which x − y ∈ E(U) iff ({x}, {y}) ∉ M.


Thus, in general, marginal independence graphs do not contain any information on higher-order joint independencies present in the data. However, under certain common parametric assumptions, dependency models would be closed under union as well. This holds, for instance, if the data are normally distributed. In that case, marginal independence is equivalent to zero covariance, pairwise independence implies joint independence, and marginal independence graphs become covariance graphs.

The following is not difficult to see.

Proposition 3.5. A marginal dependency model M which is closed under symmetry, decomposition, and union coincides with the transitive closure of {({x}, {y}) : x, y ∈ V} ∩ M over symmetry and union.


This Proposition entails that if the marginal dependencies in the data are closed under these properties, then the entire marginal dependency model is represented by the marginal independence graph.

3.2 SMIGs and Partially Ordered Sets 

To reach our aim of a complete and constructive characterization of the DAGs faithful to a given SMIG, it is useful to observe that marginal independence graphs are invariant with respect to the insertion or deletion of transitive edges from the DAG. We formalize this as follows.

Definition 3.6. A (labelled) poset P is a DAG that is identical to its transitive closure.

Proposition 3.7. The marginal independence graphs of a DAG G and its transitive closure P(G) are identical.

Proof. Two nodes are not adjacent in the marginal independence graph iff they have no common ancestor in the DAG. Transitive edges do not influence ancestral relationships. □

We thus restrict our attention to finding posets that are faithful to a given SMIG. Note that faithful DAGs can then be obtained by deleting transitive edges from faithful posets; since no DAG obtained in this way can be an edge subgraph of two different posets, this construction is unique and well-defined. In particular, by deleting all transitive edges from a poset, we obtain a sparse graphical representation of the poset as defined below.

Definition 3.8. Given a poset P = (V, E), its transitive reduction is the unique DAG G_P = (V, E') for which P(G_P) = P and E' is the smallest set where E' ⊆ E.

Transitive reductions are also known as Hasse diagrams, though Hasse diagrams are usually unlabeled. Different posets can have the same marginal independence graphs, e.g. the posets with Hasse diagrams P1 = x → y → z and P2 = x ← y → z. Similarly, Markov equivalence is a sufficient but not necessary condition for inducing the same marginal independence graphs (adding an edge x → z to P2 changes the poset and the Markov equivalence class, but not the marginal independence graph).
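The invariance under transitive edges and the three-node examples can be replayed in code. This is a small sketch under the assumption that DAGs are sets of (parent, child) pairs; `reach`, `transitive_closure` and `smig` are illustrative helper names.

```python
from itertools import combinations


def reach(nodes, edges):
    """Map each node to the set of nodes reachable from it (itself included)."""
    children = {n: set() for n in nodes}
    for p, c in edges:
        children[p].add(c)
    out = {}
    for s in nodes:
        seen, stack = {s}, [s]
        while stack:
            for c in children[stack.pop()]:
                if c not in seen:
                    seen.add(c)
                    stack.append(c)
        out[s] = seen
    return out


def transitive_closure(nodes, edges):
    """All pairs (u, v), u != v, such that v is reachable from u."""
    r = reach(nodes, edges)
    return {(u, v) for u in nodes for v in r[u] if u != v}


def smig(nodes, edges):
    """v - w is an edge iff some node reaches both v and w (a common ancestor)."""
    r = reach(nodes, edges)
    return {frozenset((v, w)) for v, w in combinations(nodes, 2)
            if any(v in r[a] and w in r[a] for a in nodes)}


nodes = ["x", "y", "z"]
p1 = {("x", "y"), ("y", "z")}      # Hasse diagram x -> y -> z
p2 = {("y", "x"), ("y", "z")}      # Hasse diagram x <- y -> z
p2_plus = p2 | {("x", "z")}        # P2 with the extra edge x -> z
same_after_closure = smig(nodes, p1) == smig(nodes, transitive_closure(nodes, p1))
```

All three DAGs induce the complete graph on x, y, z, so the marginal independence graph alone cannot tell them apart.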


3.3 Recognizing SMIGs 


We first recall existing results that show which graphs admit a faithful DAG at all, and how to find such DAGs if possible. Note that many of these results have been stated without proof (Pearl and Wermuth, 1994), but our connection to posets will make some of these proofs straightforward. The following notion related to posets is required.


Definition 3.9 (Bound graph (McMorris and Zaslavsky, 1982)). For a poset P = (V, E), the bound graph B = (V, E') of P is the graph where x − y ∈ E' iff x and y share a lower bound, i.e., have a common ancestor in P.

Figure 2: Relation between chordal graphs, trivially perfect graphs, and SMIG. In graph theory, SMIG is known as the class of (upper/lower) bound graphs (Cheston and Jap, 2006).


Theorem 3.10. SMIG is the set of all graphs for which 
every edge is contained in a simplex. 


Proof. This is Theorem 2 in Pearl and Wermuth (1994) (who referred to simplexes as "exterior cliques"). Alternatively, we can observe that the marginal independence graph U of a poset P (Definition 3.1) is equal to its bound graph (Definition 3.9). The characterization of bound graphs as "edge simplicial" graphs has been proven by McMorris and Zaslavsky (1982) by noting that simplicial nodes in U correspond to possible minimal elements in P. We note that this result predates the equivalent statement in Pearl and Wermuth (1994). □


Though all bound graphs have a faithful poset, not all bound graphs have one with the same skeleton; see Figure [?] for a counterexample. However, the graphs for which a poset with the same skeleton can be found are nicely characterizable in terms of forbidden subgraphs.


Theorem 3.11 (Pearl and Wermuth (1994)). Given a graph U, a DAG G that is faithful to U and has the same skeleton exists iff U is trivially perfect (i.e., U has neither a P4 nor a C4 as induced subgraph).


It is known that the trivially perfect graphs are the intersection of the bound graphs and the chordal graphs (Figure 2; Cheston and Jap, 2006).

This nice result raises the question whether a similar characterization in terms of forbidden induced subgraphs is also possible for SMIG. As the following observation shows, that is not the case.

Proposition 3.12. Every graph U is an induced subgraph of some graph U' ∈ SMIG.


Proof. Take any graph U = (V, E) and construct a new graph U' as follows. For every edge e = u − v in U, add a new node v_e to V and add edges v_e − u and v_e − v. Obviously U is an induced subgraph of U'. To see that U' is in SMIG, consider the DAG G consisting of the nodes in U' and the edges v ← v_e → u for each newly added node v_e in U'. Then U' is the marginal independence graph of G. □


The graph class characterization implies efficient recognition algorithms for SMIGs.

Theorem 3.13. It can be tested in polynomial time whether a graph U is a SMIG.


Proof. Verifying the graphical condition of Theorem 3.10 amounts to testing whether all edges reside within a simplex. However, knowing that SMIGs are bound graphs, we can apply an efficient algorithm for bound graph recognition that uses radix sort and simplex elimination and achieves a runtime of O(n + sm) (Skowrońska and Sysło, 1984), where s ≤ n is the number of simplexes in the graph. This is typically better than O(n^3) because large m implies small s and vice versa. Alternatively, we can apply known fast algorithms to find all simplicial nodes (Kloks et al., 2000). □
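The condition of Theorem 3.10 can also be tested directly. The sketch below is a naive polynomial-time check, not the O(n + sm) algorithm cited in the proof; it assumes an undirected graph given as a node list and a list of node pairs, and the function names are illustrative.

```python
def boundaries(nodes, edges):
    """Bd(v) = N(v) ∪ {v} for every node v."""
    bd = {v: {v} for v in nodes}
    for u, v in edges:
        bd[u].add(v)
        bd[v].add(u)
    return bd


def is_simplicial(v, bd):
    """v is simplicial iff Bd(v) ⊆ Bd(w) for every neighbour w of v."""
    return all(bd[v] <= bd[w] for w in bd[v])


def is_smig(nodes, edges):
    """True iff every edge lies inside the boundary of some simplicial node."""
    bd = boundaries(nodes, edges)
    simplexes = [bd[v] for v in nodes if is_simplicial(v, bd)]
    return all(any(u in s and v in s for s in simplexes) for u, v in edges)


path3 = is_smig("abc", [("a", "b"), ("b", "c")])                       # SMIG
path4 = is_smig("abcd", [("a", "b"), ("b", "c"), ("c", "d")])          # b - c uncovered
cycle4 = is_smig("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
```

The boundary of a simplicial node is exactly its simplex, so collecting these boundaries enumerates all simplexes without a separate clique search.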


4 FINDING FAITHFUL POSETS 


We now ask how to find faithful DAGs for simple marginal independence graphs. We observed that marginal independence graphs cannot distinguish between transitively equivalent DAGs, so a perhaps more natural question is: which posets are faithful to a given graph? As pointed out before, we can obtain all DAGs from faithful posets in a unique manner by removing transitive edges. A further advantage of the poset representation will turn out to be that the "smallest" and "largest" faithful posets can be characterized uniquely (up to isomorphism); as we shall also see, this is not as easy for DAGs, except for marginal independence graphs in a certain subclass.


4.1 Maximal Faithful Posets


Our first aim is to characterize the "upper bound" of the faithful set. That is, we wish to identify those posets for which no edge supergraph is also faithful. We will show that a construction described by Pearl and Wermuth (1994) solves exactly this problem.


Definition 4.1. For a graph U = (V, E(U)), the sink graph S(U) = (V, E(S(U))) is constructed as follows: for each edge u − v in U, add to E(S(U)): (1) an edge u → v if Bd(u) ⊊ Bd(v); (2) an edge u ← v if Bd(u) ⊋ Bd(v); (3) an edge u − v if Bd(u) = Bd(v).


For instance, the sink graph of the graph in Figure 1a is the graph in Figure 1b.


Definition 4.2 (Pearl and Wermuth (1994)). A sink orientation of a graph U is any DAG obtained by replacing every undirected edge of S(U) by a directed edge.
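The three cases of Definition 4.1 translate into a direct comparison of boundaries. The following is a minimal sketch, assuming the graph is given as a node list and a list of node pairs; the return format (a set of directed pairs plus a set of undirected frozensets) is an illustrative choice.

```python
def boundaries(nodes, edges):
    """Bd(v) = N(v) ∪ {v} for every node v."""
    bd = {v: {v} for v in nodes}
    for u, v in edges:
        bd[u].add(v)
        bd[v].add(u)
    return bd


def sink_graph(nodes, edges):
    """Return (directed, undirected) edge sets of the sink graph S(U)."""
    bd = boundaries(nodes, edges)
    directed, undirected = set(), set()
    for u, v in edges:
        if bd[u] < bd[v]:            # case (1): Bd(u) is a proper subset of Bd(v)
            directed.add((u, v))
        elif bd[v] < bd[u]:          # case (2): Bd(u) is a proper superset of Bd(v)
            directed.add((v, u))
        elif bd[u] == bd[v]:         # case (3): equal boundaries stay undirected
            undirected.add(frozenset((u, v)))
        # Incomparable boundaries match none of the three cases,
        # so such an edge does not appear in the sink graph.
    return directed, undirected


# Triangle b, c, d with a pendant node a attached to b.
directed, undirected = sink_graph(
    ["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d")])
```

In this example b has the largest boundary and therefore becomes the sink, while c and d have equal boundaries and keep an undirected edge that a sink orientation may direct either way.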


We first need to state the following.

Lemma 4.3. Every sink orientation of U is a poset.


Proof. Fix a sink orientation G and consider any chain x → y → z. By construction, this implies that Bd(x) ⊆ Bd(z). Hence, if x and z are adjacent in the sink graph, then the only possible orientation is x → z. There can be two reasons why x and z are not adjacent in the sink graph: (1) They are not adjacent in U. But then G would not be faithful, since G implies the edge x − z. (2) The edge was not added to the sink graph. But this contradicts Bd(x) ⊆ Bd(z). □

This Lemma allows us to strengthen Theorem 2 by Pearl and Wermuth (1994) in the sense that we can replace "DAG" by "maximal poset" (emphasized):

Theorem 4.4. P is a maximal poset faithful to U iff P is a sink orientation of U.

The following is also not hard to see.

Lemma 4.5. For a SMIG U, every DAG G that is faithful to U is a subgraph of some sink orientation of U.

Proof. Obviously the skeleton of G cannot contain edges that are not in U. So, suppose x → y is an edge in G but conflicts with the sink orientation; that is, the sink graph contains the edge y → x. That is the case only if Bd_U(y) is a proper subset of Bd_U(x). However, in the marginal independence graph of G, any node that is adjacent to x (has a common ancestor) must also be adjacent to y. Thus, the marginal independence graph of G cannot be U. □

Every maximal faithful poset for U can be generated by first fixing a topological ordering of S(U) and then generating the DAG that corresponds to that ordering, an idea that has also been mentioned by Drton and Richardson (2008a). This construction makes it obvious that all maximal faithful posets are isomorphic.

For the curious reader, we note that S(U) can also be viewed as a complete partially directed acyclic graph (CPDAG), which represents the Markov equivalence class of edge-maximal DAGs that are faithful to U. CPDAGs are used in the context of inferring DAGs from data (Spirtes et al., 2000; Chickering, 2003; Kalisch and Bühlmann, 2007), which is only possible up to Markov equivalence.


4.2 Minimal Faithful Posets 

A minimal poset faithful to U is one from which no further relations can be deleted without entailing more independencies than are given by U.

Definition 4.6. Let U = (V, E) be a graph and let I ⊆ V be an independent set. Then I_U^→ is the poset consisting of the nodes in I, their neighbors in U, and directed edges i → j for each i ∈ I and each j ∈ N(i).


Figure 3: (a) A graph U with three simplicial nodes I (open circles). (b) Its unique minimal faithful poset I_U^→. (c,d) The unique faithful DAGs with minimum (c) or maximum (d) numbers of edges.


For example, Figure 3b shows the unique I_U^→ for the graph in Figure 3a.

Theorem 4.7. Let U = (V, E) ∈ SMIG. Then a poset P is a minimal poset faithful to U iff P = I_U^→ for a set I consisting of one simplicial vertex for each simplex.

Proof. We first show that if I is a set consisting of one simplicial node for each simplex, then I_U^→ is a minimal faithful poset. Every edge e ∈ E(U) resides in a simplex, so it is either adjacent to I or both of its endpoints are adjacent to some i ∈ I. In both cases, I_U^→ implies e. Also I_U^→ does not imply more edges than are in U. Now, suppose we delete an edge i → x from I_U^→. This edge must exist in U, else i was not simplicial. But now I_U^→ no longer implies this edge. Thus, I_U^→ is minimal. Second, assume that P is a minimal faithful poset. Assume P would contain a sequence of two directed edges x → y → z. Then P would also contain the edge x → z. But then y → z could be deleted from P without changing the dependency graph, and P was not minimal. So, P does not contain any directed path of length more than 1. Next, observe that for each simplex in U, the nodes must all have a common ancestor in P. Without paths of length > 1, this is only possible if one node i in the simplex is a parent of all other nodes, and there are no edges among the child nodes of i. Finally, each such i must be a simplicial node in U; otherwise, it would reside in two or more simplexes, and would have to be the unique parent in those simplexes. But then the children of i would form a single simplex in U. □
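The construction in Theorem 4.7 is short enough to write down. The sketch below assumes the input is a SMIG given as a node list and a list of node pairs; it picks, for each simplex, the first simplicial node in list order, which is an arbitrary tie-breaking rule introduced here for illustration.

```python
def boundaries(nodes, edges):
    """Bd(v) = N(v) ∪ {v} for every node v."""
    bd = {v: {v} for v in nodes}
    for u, v in edges:
        bd[u].add(v)
        bd[v].add(u)
    return bd


def minimal_faithful_poset(nodes, edges):
    """Directed edges i -> j from one chosen simplicial node i per simplex."""
    bd = boundaries(nodes, edges)
    chosen = {}  # simplex (as frozenset) -> chosen simplicial node
    for v in nodes:
        if all(bd[v] <= bd[w] for w in bd[v]):       # v is simplicial
            chosen.setdefault(frozenset(bd[v]), v)   # keep the first one found
    return {(i, j) for simplex, i in chosen.items() for j in simplex if j != i}


# Triangle b, c, d with a pendant node a attached to b.
poset = minimal_faithful_poset(
    ["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d")])
```

Here the simplexes are {a, b} and {b, c, d}; choosing d instead of c for the second simplex would give an isomorphic minimal poset.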

Like the maximal posets, all minimal posets are thus isomorphic. We point out that the minimal posets contain no transitive edges and therefore, they are also edge-minimal faithful DAGs. However, this does not imply that minimal posets have the smallest possible number of edges amongst all faithful DAGs (Figure 3). There appears to be no straightforward characterization of the DAGs with the smallest number of edges for marginal independence graphs in general. However, a beautiful one exists for the subclass of trivially perfect graphs.

Definition 4.8. A tree poset is a poset whose transitive reduction is a tree (with edges pointing towards the root).

Theorem 4.9. A connected SMIG U has a faithful tree poset iff it is trivially perfect.

Proof. The bound graph of a tree poset is identical to its comparability graph (Brandstädt et al., 1999), which is the skeleton of the poset. Comparability graphs of tree posets coincide with trivially perfect graphs (Wolk, 1965). □


Since no connected graph on n nodes can have fewer edges than the transitive reduction of a tree poset on the same nodes (i.e., n − 1), tree posets coincide with faithful DAGs having the smallest possible number of edges.


How do we construct a tree for a given trivially perfect graph? Every connected trivially perfect graph must have a central point, which is a node that is adjacent to all other nodes. We set this node as the sink of the tree, and continue recursively with the connected subgraphs obtained after removing the central point. Each subgraph is also trivially perfect and can thus be oriented into a tree. After we are done, we link the sinks of the trees of the subgraphs to the original central point to obtain the full tree (Wolk, 1965).
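The recursive construction just described can be sketched as follows. The code assumes a connected graph with sortable node labels, given as a node list and a list of node pairs; `tree_poset` is a hypothetical name, and raising `ValueError` when no central point exists is a design choice made for this example.

```python
def tree_poset(nodes, edges):
    """Return (sink, tree_edges) with tree edges directed towards the sink."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def components(vs):
        """Connected components of the subgraph induced on vs."""
        remaining, comps = set(vs), []
        while remaining:
            start = remaining.pop()
            comp, stack = {start}, [start]
            while stack:
                for w in adj[stack.pop()] & remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
            comps.append(comp)
        return comps

    def build(vs):
        # A central point is adjacent to every other node of the component.
        centre = next((v for v in sorted(vs) if vs - {v} <= adj[v]), None)
        if centre is None:
            raise ValueError("no central point: not a connected trivially perfect graph")
        tree = set()
        for comp in components(vs - {centre}):
            sub_sink, sub_tree = build(comp)
            tree |= sub_tree | {(sub_sink, centre)}   # link sub-sink to the centre
        return centre, tree

    return build(set(nodes))


# Triangle b, c, d with a pendant node a attached to b.
sink, tree = tree_poset(
    ["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d")])
```

The result has n − 1 = 3 edges, the minimum possible for a connected graph on four nodes; on a graph containing an induced P4 or C4 the recursion fails to find a central point.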


5 FINDING FAITHFUL DAGS 



Figure 4: Example of the procedure in Proposition 5.2 that, given a SMIG (a), enumerates all faithful DAGs (b). For brevity, only the graphs that correspond to a fixed topological ordering are displayed. Only one set I (open circles) can be chosen in step (1). Thick edges and filled nodes highlight the DAG G. Mandatory edges (solid) link I to the sources of G; if any such edge was absent, one of the relationships in the poset I_U^→ would be missing. Optional edges (dashed) are transitively implied from the mandatory ones and G.


If a given marginal independence graph U admits faithful DAG models, then it is of interest to enumerate these. A trivial enumeration procedure is the following: start with the sink graph of U, choose an arbitrary edge e, and form all 2 or 3 subgraphs obtained by keeping e (if it is directed), orienting e (if it is undirected), or deleting it. Apply the procedure recursively to these subgraphs. During the recursion, do not touch edges that have been previously chosen. If the current graph is a DAG that is faithful to U, output it; otherwise, stop the recursion.
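A brute-force variant of this trivial procedure fits in a short script. The sketch below enumerates every keep/orient/delete choice on the sink graph at once instead of recursing, and filters for acyclicity and faithfulness at the end; it is meant for very small graphs only, and all function names are illustrative assumptions.

```python
from itertools import combinations, product


def boundaries(nodes, edges):
    bd = {v: {v} for v in nodes}
    for u, v in edges:
        bd[u].add(v)
        bd[v].add(u)
    return bd


def ancestors_map(nodes, dag):
    """Ancestors (self included) of every node, for (parent, child) edges."""
    parents = {v: set() for v in nodes}
    for p, c in dag:
        parents[c].add(p)
    anc = {}
    for v in nodes:
        seen, stack = {v}, [v]
        while stack:
            for p in parents[stack.pop()]:
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        anc[v] = seen
    return anc


def faithful_dags(nodes, edges):
    """All DAGs that are subgraphs of a sink orientation and faithful to the graph."""
    bd = boundaries(nodes, edges)
    target = {frozenset(e) for e in edges}
    options = []
    for u, v in edges:
        if bd[u] < bd[v]:
            options.append([None, (u, v)])            # directed: keep or delete
        elif bd[v] < bd[u]:
            options.append([None, (v, u)])
        elif bd[u] == bd[v]:
            options.append([None, (u, v), (v, u)])    # undirected: orient or delete
    found = []
    for choice in product(*options):
        dag = {e for e in choice if e is not None}
        anc = ancestors_map(nodes, dag)
        if any(c in anc[p] for p, c in dag):          # an edge closes a cycle
            continue
        smig = {frozenset((v, w)) for v, w in combinations(nodes, 2) if anc[v] & anc[w]}
        if smig == target:
            found.append(dag)
    return found


path_dags = faithful_dags(["a", "b", "c"], [("a", "b"), ("b", "c")])
triangle_dags = faithful_dags(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
```

For the three-node path only the collider a → b ← c survives, whereas the triangle admits 15 faithful DAGs, which illustrates how strongly the answer depends on the structure of the input graph.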

However, we can do better by exploiting the results of the 
previous section, which will allow us to derive enumeration 
algorithms that generate representations of multiple DAGs 
at each step. 

5.1 Enumeration of Faithful DAGs

Having characterized the maximal and minimal faithful posets, we are now ready to construct an enumeration procedure for all DAGs that are faithful to a given graph. We first state the following combination of Theorem 4.4 and Theorem 4.7.

Proposition 5.1. A DAG G = (V, E(G)) is faithful to a SMIG U = (V, E(U)) iff (1) G is an edge subgraph of some sink orientation of U and (2) the transitive closure of G is an edge supergraph of I_U^→ for some node set I consisting of one simplicial node for each simplex.

From this observation, we can derive our first construction procedure for faithful DAGs.

Proposition 5.2. A DAG G is faithful to a SMIG U = (V, E(U)) iff it can be generated by the following steps. (1) Pick any set I ⊆ V consisting of one simplicial node for each simplex. (2) Generate any DAG on the nodes V \ I that is an edge subgraph of some sink orientation of U. (3) Add any subset of edges from I_U^→ such that the transitive closure of the resulting graph contains all edges of I_U^→.

While step (3) may seem ambiguous, Figure 4 illustrates that after step (2), the edges from I_U^→ decompose nicely into mandatory and optional ones. This means that we can in fact stop the construction procedure after step (2) and output a "graph pattern", in which some edges are marked as optional. This is helpful in light of the potentially huge space of faithful models, because every graph pattern can represent an exponential number of DAGs.

5.2 Enumeration of Faithful Posets 

The DAGs resulting from the procedure in Proposition 5.2 are in general redundant because no care is taken to avoid generating transitive edges. By combining Propositions 5.1 and 5.2 we obtain an algorithm that generates sparse, non-redundant representations of the faithful DAGs.

Theorem 5.3. A poset P is faithful to U = (V, E(U)) iff it can be generated by the following steps. (1) Pick any set I ⊆ V consisting of one simplicial node for each simplex. (2) Generate a poset P on the nodes V \ I that is an edge subgraph of some sink orientation of U. (3) Add I_U^→ to P.

A nice feature of this construction is that step (3) is unambiguous: every choice for I in step (1) and P in step (2) yields exactly one poset. Figure 5 gives an explicit pseudocode for an algorithm that uses Theorem 5.3 to enumerate all faithful posets.

Our algorithm is efficient in the sense that at every internal node in its recursion tree, it outputs a faithful poset. At every node we need to evaluate whether the current G is acyclic and atransitive (i.e., contains no transitive edges), which can be done in polynomial time. Also simplexes and their simplicial vertices can be found in polynomial time (Kloks et al., 2000). Thus, our algorithm is a polynomial delay enumeration algorithm similar to the ones used to enumerate adjustment sets for DAGs (Textor and Liśkiewicz, 2011; van der Zander et al., 2014).

Figure 6 shows an example output for this algorithm.

function FaithfulPosets(U = (V(U), E(U)))
    function ListPosets(G, S, R, I_U^→)
        if G is acyclic and atransitive then
            Output G ∪ I_U^→
            if skeleton of G ⊊ skeleton of S then
                e ← some edge consistent with E(S) \ R
                ListPosets(G, S, R ∪ {e}, I_U^→)
                E(G) ← E(G) ∪ {e}
                ListPosets(G, S, R ∪ {e}, I_U^→)
    for all node sets I of U consisting of one simplicial node per simplex do
        G ← empty graph on nodes of V(U) \ I
        S ← sink graph of U on nodes of V(U) \ I
        ListPosets(G, S, ∅, I_U^→)

Figure 5: Enumeration algorithm for faithful posets.

[Table 1 body only partially recovered; surviving cell values: 11,117; 1,460; 90; 10; 11,716,571; 7,799; 366]

Table 1: Comparison of the number of unlabeled connected graphs with n nodes to the number of such graphs that are also SMIGs. For n = 13 (not shown), non-SMIGs outnumber SMIGs by more than 10^ : 1.

Figure 6: (a) A graph U and its sink graph. (b) Transitive reductions of all 6 faithful posets that are generated by Algorithm FaithfulPosets for the input graph (a).


6 EXAMPLE APPLICATIONS 

In this section, we apply the previous results to explore 
some explicit combinatorial properties of SMIGs and their 
faithful DAGs. 


6.1 Counting SMIGs 


We revisit the question: when can a marginal independence graph allow a causal interpretation (Pearl and Wermuth, 1994)? More precisely, we ask how many marginal independence graphs on n variables are SMIGs. We reformulate this question into a version that has been investigated in the context of poset theory. Let the height of a poset P be the length of a longest path in P. The following is an obvious implication of Theorem 4.7.


Corollary 6.1. The number M(n) of non-isomorphic SMIGs with n nodes is equal to the number of non-isomorphic posets on n variables of height 1.


Enumeration of posets is a highly nontrivial problem, and an intensively studied one. The online encyclopedia of integer sequences (OEIS) tabulates M(n) for n up to 40 (Wambach, 2015). We give the first 10 entries of the sequence in Table 1 and compare it to the number of graphs in general (up to isomorphism). As we observe, the fraction of graphs that admit a DAG on the same variables decreases swiftly as n increases.


6.2 Graphs with a Unique Faithful DAG 

From a causal inference viewpoint, the best we can hope for is a SMIG to which only a single, unique DAG is faithful. The classical example is the three-node path graph, which for more than 3 nodes generalizes to a "star" graph. However, for 5 or more nodes there are graphs other than the star which also induce a single unique DAG. Combining Lemma 4.3 and Theorem 4.7 allows for a simple characterization of all such SMIGs.

Corollary 6.2. A SMIG U with n nodes has a unique faithful DAG iff each of its simplexes contains only one simplicial node and its sink orientation equals I_U^→.

Based on this characterization, we computed the number of SMIGs with unique DAGs for n up to 9 (Table 1). Interestingly, this integer sequence does not seem to correspond to any known one.

 n     posets with n nodes     faithful to C_n
 1                       1                   1
 2                       3                   2
 3                      19                   9
 4                     219                  76
 5                   4,231               1,095
 6                 130,023              25,386
 7               6,129,859             910,161
 8             431,723,379          49,038,872
 9          44,511,042,511       3,885,510,411
10       6,611,065,248,783     445,110,425,110

Table 2: Possible labelled posets on n variables before and after observing a complete SMIG C_n.

6.3 Information Content of a SMIG 

How much information does a marginal independence
graph contain? Let us denote the number of posets on
n variables by P(n). After observing a marginal
independence graph U, the number of models that are still
faithful to the data reduces to P(n) − k(U), where
k(U) ≤ P(n) (indeed, quite often k(U) = P(n), as we
can see in Table 1). Of course, the number k(U) strongly
depends on the structure of the SMIG U. But even in the
worst case when U is a complete graph, the space of
possible models is still reduced because not all DAGs entail a
complete marginal independence graph.

Thus, the following simple consequence of Theorem 4.7
helps to derive a worst-case bound on how much a SMIG
reduces structural uncertainty with respect to the model
space of posets with n variables.

Corollary 6.3. The number of faithful posets with respect
to a complete graph with n nodes is n times the number of
posets with n − 1 nodes.

Table 2 lists the number of possible posets before and after
observing a complete SMIG for up to 10 variables. In this
sense, at n = 10, the uncertainty is reduced about 15-fold.
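Corollary 6.3 and the quoted reduction factor can be checked numerically. The following is a minimal sketch, assuming that every node counts as its own ancestor; the function names are hypothetical. It enumerates labelled posets by brute force, counts those in which every pair of nodes shares an ancestor (the posets faithful to a complete graph), and compares the count with n · P(n − 1). The two constants are copied from the last row of the table of labelled poset counts.

```python
from itertools import combinations, permutations, product


def strict_orders(n):
    """Yield each strict partial order on {0, ..., n-1} as a set of pairs."""
    pairs = list(permutations(range(n), 2))
    for mask in product((False, True), repeat=len(pairs)):
        rel = {p for p, keep in zip(pairs, mask) if keep}
        if any((b, a) in rel for (a, b) in rel):
            continue  # violates antisymmetry
        if any((a, c) not in rel
               for (a, b) in rel for (b2, c) in rel if b2 == b):
            continue  # violates transitivity
        yield rel


def count_posets(n):
    """P(n): number of labelled posets on n elements."""
    return sum(1 for _ in strict_orders(n))


def count_faithful_to_complete(n):
    """Number of labelled posets in which every pair shares an ancestor."""
    def faithful(rel):
        def anc(v):
            return {v} | {a for (a, b) in rel if b == v}
        return all(anc(v) & anc(w) for v, w in combinations(range(n), 2))

    return sum(1 for rel in strict_orders(n) if faithful(rel))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, count_posets(n), count_faithful_to_complete(n),
              n * count_posets(n - 1))

    # Reduction factor at n = 10, using the tabulated values.
    P10 = 6_611_065_248_783
    FAITHFUL_C10 = 445_110_425_110
    print(round(P10 / FAITHFUL_C10, 2))
```

For n up to 4 the last two printed columns agree, as the corollary states. The intuition is that a finite poset in which every pair has a common ancestor must have a least element; there are n choices for it and P(n − 1) orders on the remaining nodes.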

We note that a similar but more technical analysis is possible
for uncertainty reduction with respect to DAGs instead
of posets. We omit this due to space limitations.


7 MODELS WITH LATENT VARIABLES


In this section we consider situations in which a graph U
is not a SMIG (which can be detected using the algorithm
in Theorem 3.13). Similarly to the definition proposed in
Pearl and Verma (1987) for the general dependency models,
to obtain faithful DAGs for such graphs we will extend
the DAGs with some auxiliary nodes. We generalize
Definition 3.4 as follows.

Definition 7.1. Let U = (V, E(U)) be a graph and let
Q, with Q ∩ V = ∅, be a set of auxiliary nodes. A DAG
G = (V ∪ Q, E(G)) is faithful to U if for all v, w ∈ V,
v − w ∈ E(U) iff v and w have a common ancestor in G.

The result below follows immediately from
Proposition 3.12.

Proposition 7.2. For every graph U there exists a faithful
DAG G with some auxiliary nodes.

Obviously, if U ∈ SMIG then there exists a DAG faithful
to U with Q = ∅. For U ∉ SMIG, from the proof of
Proposition 3.12 it follows that there exists a set Q of at
most |E(U)| nodes and a DAG G such that G is faithful
to U with auxiliary nodes Q. But the problem arises to
minimize the cardinality of Q.

Theorem 7.3. The problem to decide if for a given graph
U and an integer k, there exists a faithful DAG with at most
k auxiliary nodes, is NP-complete.


Proof. It is easy to see that the problem is in NP. To prove
that it is NP-hard, we show a polynomial time reduction
from the edge clique cover problem, which is known to be
NP-complete (Karp, 1972). Recall that the edge clique
cover problem is to decide if for a graph U and an integer k
there exists a set of k subgraphs of U, such that each
subgraph is a clique and each edge of U is contained in at least
one of these subgraphs.


Let U = (V, E) and k be an instance of the edge clique
cover problem, with V = {v_1, ..., v_n}. We construct the
marginal independence graph U' as follows. Let W =
{w_1, ..., w_n}. Then V(U') = V ∪ W and E(U') =
E ∪ {v_i − w_i : i = 1, ..., n}. Obviously, U' can be
constructed from U in polynomial time. We claim that
U = (V, E) can be covered by ≤ k cliques iff for U' there
exists a faithful DAG G with at most k auxiliary nodes.


Assume first that U = (V, E) can be covered by at most k
cliques, let us say C_1, ..., C_{k'}, with k' ≤ k. Then we can
construct a faithful DAG G for U' with k' auxiliary nodes
as follows. Its set of nodes is V(G) = V ∪ W ∪ Q, where
Q = {q_1, ..., q_{k'}}. The edges E(G) can be defined as

\[
E(G) = \{ w_i \to v_i : i = 1, \dots, n \} \;\cup\; \bigcup_{j=1}^{k'} \{ q_j \to v : v \in C_j \}.
\]

It is easy to see that G is faithful to U'.
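This direction of the reduction can be illustrated on a small instance. The sketch below is not part of the proof: it builds U' from U, builds the DAG from a given clique cover, and checks the faithfulness condition of Definition 7.1 directly. The names (ancestors, is_faithful, reduction_graph, dag_from_cover), the tuple labels ("w", v) and ("q", j), and the example graph are all hypothetical. A node is assumed to count as its own ancestor, and the DAG is stored as a map from each node to its set of parents.

```python
from itertools import combinations


def ancestors(parents, node):
    """Return node together with all of its ancestors."""
    seen, stack = {node}, [node]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def is_faithful(parents, observed, edges):
    """Definition 7.1: v - w is an edge iff v, w share an ancestor."""
    edge_set = {frozenset(e) for e in edges}
    return all(
        (frozenset((v, w)) in edge_set)
        == bool(ancestors(parents, v) & ancestors(parents, w))
        for v, w in combinations(observed, 2)
    )


def reduction_graph(nodes, edges):
    """Build U' from U: add a pendant node ("w", v) next to every v."""
    observed = list(nodes) + [("w", v) for v in nodes]
    return observed, list(edges) + [(v, ("w", v)) for v in nodes]


def dag_from_cover(nodes, cover):
    """Parent sets of the DAG: w_i -> v_i, and q_j -> v for v in C_j."""
    parents = {v: {("w", v)} for v in nodes}
    for j, clique in enumerate(cover):
        for v in clique:
            parents[v].add(("q", j))
    return parents


nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]  # a triangle plus a pendant edge
cover = [{1, 2, 3}, {3, 4}]               # an edge clique cover, k' = 2
observed, edges_prime = reduction_graph(nodes, edges)
print(is_faithful(dag_from_cover(nodes, cover), observed, edges_prime))
```

Dropping the clique {3, 4} from the cover leaves the edge 3 − 4 without a common ancestor, and the same check then fails.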

Now assume that a DAG G, with at most k auxiliary nodes
Q, is faithful to U'. From the construction of U' it follows
that for all different nodes v_i, v_j ∈ V there is no directed
path from v_i to v_j in G. If such a path existed, then v_i
would be an ancestor of v_j in G. Since v_i − w_i is an edge
of U', the nodes v_i and w_i have a common ancestor in G,
which must also be a common ancestor of w_i and v_j, a
contradiction because w_i and v_j are not adjacent in U'.
Thus, all treks connecting pairs of nodes from V in G must
contain auxiliary nodes.

Next, we slightly modify G: for each w_i we remove all
incident edges and add the new edge w_i → v_i. The resulting
graph G' is a DAG which remains faithful to U'. Indeed,
we cannot obtain a directed cycle in G' since no w_i has
an in-edge and the original G was a DAG. To see that the
obtained DAG remains faithful to U', note first that after
the modifications, w_i and v_i have a common ancestor in G'
whereas w_i and v_j, with i ≠ j, do not. Otherwise, since
w_i is the only possible ancestor of both nodes, there would
be a directed path from v_i to v_j, a contradiction. Finally,
note that any trek connecting v_i and v_j in G cannot contain
a node from W. Similarly, no trek between v_i and v_j in
G' contains a node from W. We get that v_i and v_j have a
common ancestor in G iff they have a common ancestor in
G'.

Thus, in G' the auxiliary nodes Q are adjacent to nodes
from V, but not to nodes from W. Below we modify G'
further and obtain a DAG G'', in which every auxiliary node
is adjacent to a node in V via an out-edge only. To this aim
we remove from G' all edges going out from a node in V to
a node in Q.

Obviously, if v_i and v_j have a common ancestor in G'',
then they also have a common ancestor in G', because
E(G'') ⊆ E(G'). The opposite direction follows from the
fact we have shown at the beginning of this proof that for
all different nodes v_i, v_j ∈ V there is no directed path from
v_i to v_j in G. This is true also for G'. Thus, if v_i and
v_j have a common ancestor, say x, in G' then x ∈ Q and
there exist directed paths x → y_1 → ... → y_r → v_i and
x → y'_1 → ... → y'_{r'} → v_j such that also all y_1, ..., y_r
and y'_1, ..., y'_{r'} belong to Q. But from the construction
of G'' it follows that both paths belong also to G''.

Since G'' is faithful to U', for every auxiliary node q the
subgraph induced by its children Ch(q) ∩ V in G'' is a
clique in U'. Moreover every edge v_i − v_j of the graph
U belongs to at least one such clique. Thus the subgraphs
induced by Ch(q_1) ∩ V, ..., Ch(q_{k'}) ∩ V, with k' ≤ k, are
cliques that cover U. □
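The final step of the argument, reading a clique cover off the auxiliary nodes, can be sketched as follows. This is an illustration under stated assumptions rather than the construction from the proof: it collects the descendants of each auxiliary node inside V instead of its children, which is the same thing whenever every auxiliary node points directly into V, as in the example. The function names, the node labels, and the example DAG (stored as a map from each node to its set of children) are hypothetical.

```python
from itertools import combinations


def descendants(children, node):
    """Return all proper descendants of node."""
    seen, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen


def cover_from_dag(children, aux_nodes, v_nodes):
    """One candidate clique per auxiliary node: its descendants in V."""
    v_set = set(v_nodes)
    return [descendants(children, q) & v_set for q in aux_nodes]


def is_edge_clique_cover(cover, edges):
    """Check that each set is a clique and each edge lies in some set."""
    edge_set = {frozenset(e) for e in edges}
    all_cliques = all(
        frozenset(pair) in edge_set
        for clique in cover
        for pair in combinations(clique, 2)
    )
    all_covered = all(
        any(set(e) <= clique for clique in cover) for e in edges
    )
    return all_cliques and all_covered


# Example DAG: two auxiliary nodes, plus w_i -> v_i for each v_i.
children = {
    "q1": {1, 2, 3}, "q2": {3, 4},
    "w1": {1}, "w2": {2}, "w3": {3}, "w4": {4},
}
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
cover = cover_from_dag(children, ["q1", "q2"], [1, 2, 3, 4])
print(cover, is_edge_clique_cover(cover, edges))
```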

8 DISCUSSION 

Given a graph that represents a set of pairwise marginal
independencies, which causal structures on the same
variables might have generated this graph? Here we
characterized all these structures, or alternatively, all maximal and
minimal ones. Furthermore, we have shown that it is
possible to deduce how many exogenous variables (which
correspond to simplicial nodes) the causal structure might have,
and even to tell whether it might be a tree. For graphs that
do not admit a DAG on the same variables, we have studied
the problem of explaining the data with as few additional
variables as possible, and proved it to be NP-hard. This
may be surprising; the related problem of finding a mixed
graph that is Markov equivalent to a bidirected graph and
has as few bidirected edges as possible is efficiently
solvable (Drton and Richardson, 2008a).


The connection to posets emphasizes that sets of faithful
DAGs have complex combinatorics. Indeed, if there are
no pairwise independent variables then we obtain the
classical poset enumeration problem (Brinkmann and McKay,
2002). Our current, unoptimized implementation of the
algorithm in Figure 5 allows us to deal with dense graphs up
to about 12 nodes (sparse graphs are easier to deal with).
We point out that our enumeration algorithms operate with
a “template graph”, i.e., the sink orientation. It is possible
to incorporate certain kinds of background knowledge, like
a time-ordering of the variables, into this template graph
by deleting some edges. Such further constraints could
greatly reduce the search space. Another additional
constraint that could be used for linear models is the
precision matrix (Cox and Wermuth, 1993; Pearl and Wermuth,
1994), though finding DAGs that explain a given precision
matrix is NP-hard in general (Verma and Pearl, 1993).


We observed that the pairwise marginal independencies
substantially reduce structural uncertainty even in the worst
case (Table 1). Causal inference algorithms could
exploit this to reduce the number of CI tests. The PC
algorithm (Kalisch and Bühlmann, 2007), for instance, forms
the marginal independence graph as a first stage before
performing any CI tests. At that stage, it could be immediately
tested if the resulting graph is a SMIG, and if not, the
algorithm can terminate as no faithful DAG exists.


In summary, we have mapped out the space of causal
structures that are faithful to a given set of pairwise marginal
independencies using constructive criteria that lead to
well-structured enumeration procedures. The central idea
underlying our results is that faithful models for marginal
independencies are better described by posets than by DAGs.
Our results allow us to quantify how much our uncertainty
about a causal structure is reduced when we invoke the
faithfulness assumption and observe a set of marginal
independencies.

In future work, it would be interesting to extend our
approach to small (instead of empty) conditioning sets, which
would cover cases where we only wish to perform CI tests
with low dimensionality.
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